Integrative analysis of KRAS wildtype metastatic pancreatic ductal adenocarcinoma reveals mutation and expression-based similarities to cholangiocarcinoma

Oncogenic KRAS mutations are absent in approximately 10% of patients with metastatic pancreatic ductal adenocarcinoma (mPDAC) and may represent a subgroup of mPDAC with therapeutic options beyond standard-of-care cytotoxic chemotherapy. While distinct gene fusions have been implicated in KRAS wildtype mPDAC, information regarding other types of mutations remains limited, and gene expression patterns associated with KRAS wildtype mPDAC have not been reported. Here, we leverage sequencing data from the PanGen trial to comprehensively characterize the molecular landscape of KRAS wildtype mPDAC and reveal an increased frequency of chr1q amplification encompassing the transcription factors PROX1 and NR5A2. By leveraging data from colorectal adenocarcinoma and cholangiocarcinoma samples, we highlight similarities between cholangiocarcinoma and KRAS wildtype mPDAC involving both mutation and expression-based signatures and validate these findings using an independent dataset. These data further establish KRAS wildtype mPDAC as a unique molecular entity, with therapeutic opportunities extending beyond gene fusion events.

The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable.
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated
Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code
Data collection
Data analysis
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information.

March 2021
Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A description of any restrictions on data availability
- For clinical datasets or third party data, please ensure that the statement adheres to our policy

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
Data exclusions

Blinding
Behavioural & social sciences study design
All studies must disclose on these points even when the disclosure is negative.

Study description
Research sample
Sampling strategy

Data collection
Genomic data generated within the PanGen/POG and COMPASS studies are actively submitted to the European Genome-phenome Archive (EGA) under accession numbers #EGAS00001001159 (https://ega-archive.org/studies/EGAS00001001159) and #EGAS00001002543 (https://ega-archive.org/studies/EGAS00001002543), respectively. Data uploaded to EGA as part of the POG/PanGen study, including raw RNA and whole-genome sequencing files, will be made available to interested researchers while respecting patient privacy, and can be accessed through the BC Cancer Data Access Committee (https://ega-archive.org/dacs/EGAC00000000011; email address: tdoadmin@phsa.ca), which provides responses within three to five business days. Upon establishment and signing of the data transfer agreement, EGA data release can be expected within three business days. Once access has been granted, the period during which the data can be downloaded is flexible according to the downloader's needs. Data access through EGA is on a limited use and project-specific basis. These data are available under restricted access in accordance with the ethical data regulations followed by the POG and PanGen trials. Hartwig data were accessed through the Hartwig Medical Foundation database (https://www.hartwigmedicalfoundation.nl/data/databank/). The UniProt Human proteome is available from https://www.uniprot.org/proteomes/UP000005640. Processed VTCN1 and PROX1 protein level data are included as Supplementary Data 7. Raw protein data are available in the Proteomics Identifications Database (PRIDE) under accession number PXD036632. Source data are provided in this paper. The remaining data are available within the Article, Supplementary Information or Source Data file.
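As a convenience for readers following the access route described above, the study landing pages can be derived from the accession numbers. The sketch below is illustrative only: the helper name `ega_study_url` and the accession pattern (the prefix `EGAS` followed by 11 digits) are assumptions inferred from the two accessions quoted in the statement, and the code only builds URLs. It does not download data, which remains subject to approval by the BC Cancer Data Access Committee.

```python
import re

# Study accessions and base URL as quoted in the data availability statement.
EGA_STUDY_BASE = "https://ega-archive.org/studies/"
STUDY_ACCESSIONS = {
    "PanGen/POG": "EGAS00001001159",
    "COMPASS": "EGAS00001002543",
}

# Assumption: a study accession is "EGAS" plus 11 digits, inferred only
# from the two accessions listed above.
_ACCESSION_PATTERN = re.compile(r"EGAS\d{11}")


def ega_study_url(accession: str) -> str:
    """Return the EGA study page URL for an accession such as '#EGAS00001001159'."""
    cleaned = accession.strip().lstrip("#")
    if not _ACCESSION_PATTERN.fullmatch(cleaned):
        raise ValueError(f"unexpected accession format: {accession!r}")
    return EGA_STUDY_BASE + cleaned


if __name__ == "__main__":
    for cohort, accession in STUDY_ACCESSIONS.items():
        print(f"{cohort}: {ega_study_url(accession)}")
```

Rejecting identifiers that do not match the assumed pattern keeps accessions from other repositories (for example the PRIDE accession PXD036632) from being turned into invalid EGA links.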
The PanGen cohort consisted of 63 patients with metastatic pancreatic ductal adenocarcinoma. No sample size calculation was performed, and sample size was determined by using the maximum number of samples available at the time of manuscript preparation.
No data were excluded from any cohorts.
Validation cohorts consisted of COMPASS (n=195; metastatic PDAC), and Hartwig (n=113; metastatic PDAC) external datasets. Samples from patients enrolled in the POG trial were also used in the analysis, and included patients with metastatic colorectal adenocarcinoma (n=63) and metastatic cholangiocarcinoma (n=14). Cholangiocarcinoma samples from the Hartwig Foundation (n=25) were also used in validation analysis.
Samples in this study were not randomized into individual groups. Randomization was not applicable to this study as the groups of interest were retrospective and based solely on the presence of an oncogenic KRAS mutation in the tumor.
Blinding was not performed for this study. Blinding was not applicable to this study as the groups of interest were retrospective and based solely on the presence of an oncogenic KRAS mutation in the tumor.
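The retrospective grouping described above, in which each tumor is assigned to a group solely by the presence of an oncogenic KRAS mutation, can be sketched as follows. This is a minimal illustration and not the study's pipeline: the sample identifiers, the `Mutation` record layout, and the set of codons treated as oncogenic (12, 13 and 61, commonly cited KRAS hotspots) are assumptions made for the example, and the source does not state the criteria used to call a mutation oncogenic.

```python
from dataclasses import dataclass

# Assumption: codons treated as oncogenic KRAS hotspots, for illustration only.
ASSUMED_ONCOGENIC_KRAS_CODONS = {12, 13, 61}


@dataclass(frozen=True)
class Mutation:
    """Hypothetical minimal mutation record: one protein-altering variant."""
    sample_id: str
    gene: str
    codon: int


def group_by_kras_status(sample_ids, mutations):
    """Label each sample by presence of an (assumed) oncogenic KRAS mutation."""
    mutant_samples = {
        m.sample_id
        for m in mutations
        if m.gene == "KRAS" and m.codon in ASSUMED_ONCOGENIC_KRAS_CODONS
    }
    return {
        sid: "KRAS mutant" if sid in mutant_samples else "KRAS wildtype"
        for sid in sample_ids
    }


if __name__ == "__main__":
    # Entirely made-up example records; not data from any cohort.
    samples = ["S1", "S2", "S3"]
    calls = [
        Mutation("S1", "KRAS", 12),
        Mutation("S2", "TP53", 175),
        Mutation("S3", "KRAS", 146),  # outside the assumed hotspot set
    ]
    for sid, status in group_by_kras_status(samples, calls).items():
        print(sid, status)
```

Because group membership is a deterministic function of the observed genotype, there is no allocation step to randomize or blind, which is the rationale given above.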

Materials & experimental systems
Tick this box to confirm that the raw and calibrated dates are available in the paper or in Supplementary Information.

Ethics oversight
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Animals and other organisms
Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research
Laboratory animals

Wild animals
Field-collected samples

Ethics oversight
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Human research participants
Policy information about studies involving human research participants
Population characteristics

Recruitment
Ethics oversight
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Clinical data
Policy information about clinical studies
All manuscripts should comply with the ICMJE guidelines for publication of clinical research and a completed CONSORT checklist must be included with all submissions.

Clinical trial registration
Study protocol

Data collection
Outcomes
Dual use research of concern
Policy information about dual use research of concern
Hazards
Could the accidental, deliberate or reckless misuse of agents or technologies generated in the work, or the application of information presented in the manuscript, pose a threat to:

Patients in Canada who were diagnosed with metastatic pancreatic ductal adenocarcinoma and had not yet received treatment for their metastatic disease were referred to the PanGen trial by their treating oncologist. For the POG trial, patients were recruited based on the POG trial inclusion criteria (NCT02155621), which differed from those of PanGen in that patients were not required to be untreated for their metastatic disease. CRAs at participating centers approached patients who were potentially eligible for this study, and patients were enrolled after they provided written informed consent and the eligibility criteria were confirmed. Participants were enrolled from among the patients treated at the participating institutions or referred for participation in a clinical trial. While participants were enrolled in public cancer centers in a universally funded healthcare system, those referred may not be representative of the broader patient population with metastatic PDAC.
This work was approved by and conducted under the University of British Columbia - BC Cancer Research Ethics Board (H12-00137, H14-00681, H16-00291), was approved by the institutional review board, and was conducted in accordance with international ethical guidelines. Written informed consent was obtained from each patient upon study enrollment and prior to molecular profiling. All sequencing data were housed in a secure computing environment.
In this manuscript, primary and secondary outcomes of the trial are not reported. Rather, outcome analyses were performed with a focus on specific genomic alterations present in a subset of patients.
nature portfolio | reporting summary
